Add audio translation task type and provider #335

julien-nc · 2026-02-04T16:11:50Z

New audio2audio:translate task type (will be created in server soon)
Audio translation provider
Factor the translation logic out into a service
Use correct user language for IL10N text translations happening in the task

…, factorize translation logic in a service use the correct user language for text translations happening in the task Signed-off-by: Julien Veyssier <julien-nc@posteo.net>

kyteinsky · 2026-02-04T18:04:33Z

lib/TaskProcessing/AudioToAudioTranslateTaskType.php

+			'text_output' => new ShapeDescriptor(
+				$this->l->t('Text output'),
+				$this->l->t('The text translation'),
+				EShapeType::Text,
+			),


Maybe the transcribed text of the audio input can be sent back too; it's already computed.
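A minimal sketch of what this suggestion could look like, assuming the transcript is exposed as one more text entry in the output shape. The `EShapeType` and `ShapeDescriptor` classes below are small stand-ins for the Nextcloud ones so the snippet runs on its own, and the `input_transcript` key and the function name are hypothetical, not part of the PR. The real output shape likely has other entries that are not visible in this hunk.

```php
<?php
declare(strict_types=1);

// Stand-ins for OCP\TaskProcessing\EShapeType and ShapeDescriptor,
// defined here only so the sketch runs without Nextcloud.
final class EShapeType {
	public const Text = 'Text';
}

final class ShapeDescriptor {
	public function __construct(
		public string $name,
		public string $description,
		public string $type,
	) {
	}
}

/**
 * Output shape with the existing 'text_output' entry plus a hypothetical
 * 'input_transcript' entry carrying the already computed transcription.
 */
function audioTranslateOutputShapeSketch(): array {
	return [
		'text_output' => new ShapeDescriptor(
			'Text output',
			'The text translation',
			EShapeType::Text,
		),
		// Assumption: key name and wording are illustrative only.
		'input_transcript' => new ShapeDescriptor(
			'Input transcript',
			'The transcription of the audio input, before translation',
			EShapeType::Text,
		),
	];
}
```

Since the transcription is already produced on the way to the translation, returning it would only mean adding one more key to the task output.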

kyteinsky · 2026-02-04T18:06:34Z

lib/TaskProcessing/AudioToAudioTranslateProvider.php

+	public function getOptionalInputShape(): array {
+		return [];
+	}


What do you think of adding the voice and speed optional params here too, same as the text-to-speech task?

integration_openai/lib/TaskProcessing/TextToSpeechProvider.php

Lines 64 to 84 in dfd4567

	public function getOptionalInputShape(): array {
		return [
			'voice' => new ShapeDescriptor(
				$this->l->t('Voice'),
				$this->l->t('The voice to use'),
				EShapeType::Enum
			),
			'model' => new ShapeDescriptor(
				$this->l->t('Model'),
				$this->l->t('The model used to generate the speech'),
				EShapeType::Enum
			),
			'speed' => new ShapeDescriptor(
				$this->l->t('Speed'),
				$this->openAiAPIService->isUsingOpenAi(Application::SERVICE_TYPE_TTS)
					? $this->l->t('Speech speed modifier (Valid values: 0.25-4)')
					: $this->l->t('Speech speed modifier'),
				EShapeType::Number
			)
		];
	}
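As a rough sketch of the suggestion, the audio translation provider could return the same `voice` and `speed` descriptors instead of an empty array. This is an assumption about how the change might look, not the PR's code: the classes are stand-ins so the snippet runs without Nextcloud, the strings are not passed through `$this->l->t()`, and the `$usingOpenAi` parameter replaces the `isUsingOpenAi()` call from the text-to-speech provider.

```php
<?php
declare(strict_types=1);

// Stand-ins for the Nextcloud classes, for illustration only.
final class EShapeType {
	public const Enum = 'Enum';
	public const Number = 'Number';
}

final class ShapeDescriptor {
	public function __construct(
		public string $name,
		public string $description,
		public string $type,
	) {
	}
}

/**
 * Hypothetical optional input shape for the audio translation provider,
 * mirroring the 'voice' and 'speed' entries of the text-to-speech provider.
 */
function audioTranslateOptionalInputShapeSketch(bool $usingOpenAi): array {
	return [
		'voice' => new ShapeDescriptor(
			'Voice',
			'The voice to use',
			EShapeType::Enum,
		),
		'speed' => new ShapeDescriptor(
			'Speed',
			// The valid range is only advertised when the OpenAI service is used.
			$usingOpenAi
				? 'Speech speed modifier (Valid values: 0.25-4)'
				: 'Speech speed modifier',
			EShapeType::Number,
		),
	];
}
```

An `Enum` input normally also needs its list of allowed values to be provided by the provider; that part is left out here.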

kyteinsky · 2026-02-04T18:08:05Z

lib/TaskProcessing/AudioToAudioTranslateProvider.php

+				$this->logger->warning('Text to speech generation failed: no speech returned');
+				throw new ProcessingException('Text to speech generation failed: no speech returned');
+			}
+			$translatedAudio = $includeWatermark ? $this->watermarkingService->markAudio($apiResponse['body']) : $apiResponse['body'];


Maybe better to watermark the transcript of the input audio, so that the translated text and the translated audio both carry the watermark in the target language.
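The ordering proposed here can be sketched as follows: the notice is appended to the transcript before translation, so it is translated along with the rest and ends up in both outputs. The function name, the result keys and the two callables are hypothetical stand-ins for the real translation and text-to-speech calls; the toy implementations only make the data flow visible.

```php
<?php
declare(strict_types=1);

/**
 * Appends the AI notice to the transcript *before* translating, so the
 * translated text and the generated speech both contain the translated notice.
 *
 * @param callable(string): string $translate    stand-in for the translation call
 * @param callable(string): string $textToSpeech stand-in for the TTS call
 * @return array{text: string, audio: string}
 */
function translateAudioSketch(
	string $transcript,
	bool $includeWatermark,
	callable $translate,
	callable $textToSpeech,
): array {
	if ($includeWatermark) {
		$transcript .= "\n\n" . 'This was generated using Artificial Intelligence.';
	}
	$translatedText = $translate($transcript);
	return [
		'text' => $translatedText,
		'audio' => $textToSpeech($translatedText),
	];
}

// Toy stand-ins: "translation" upper-cases, "speech" just wraps the text.
$result = translateAudioSketch(
	'hello',
	true,
	fn (string $text): string => strtoupper($text),
	fn (string $text): string => 'audio<' . $text . '>',
);
```

With this ordering no separate localized notice is needed for the target language, at the cost of relying on the translation step to carry the notice over faithfully.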

kyteinsky · 2026-02-04T18:13:18Z

lib/TaskProcessing/AudioToAudioTranslateProvider.php

+		if ($includeWatermark) {
+			if ($userId !== null) {
+				$user = $this->userManager->getExistingUser($userId);
+				$lang = $this->l10nFactory->getUserLanguage($user);
+				$l = $this->l10nFactory->get(Application::APP_ID, $lang);
+				$ttsPrompt .= "\n\n" . $l->t('This was generated using Artificial Intelligence.');
+			} else {
+				$ttsPrompt .= "\n\n" . $this->l->t('This was generated using Artificial Intelligence.');
+			}
+		}


This can also work, but it would add the text/audio in the user's language, which may or may not be the target language.

kyteinsky · 2026-02-04T18:15:12Z

lib/AppInfo/Application.php

+			$context->registerTaskProcessingTaskType(AudioToAudioTranslateTaskType::class);
+			$context->registerTaskProcessingProvider(AudioToAudioTranslateProvider::class);


Checks for STT and TTS providers would be nice too.
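One possible shape for such a check, as a sketch only: register the audio translation task type and provider only when both speech-to-text and text-to-speech are available. The `FakeRegistrationContext` class and the two boolean flags are hypothetical; how the app actually determines that STT and TTS are enabled is not shown in this PR excerpt.

```php
<?php
declare(strict_types=1);

// Stand-in for the registration context, recording what gets registered.
final class FakeRegistrationContext {
	/** @var string[] */
	public array $taskTypes = [];
	/** @var string[] */
	public array $providers = [];

	public function registerTaskProcessingTaskType(string $class): void {
		$this->taskTypes[] = $class;
	}

	public function registerTaskProcessingProvider(string $class): void {
		$this->providers[] = $class;
	}
}

/**
 * Audio translation chains STT, translation and TTS, so it is only
 * registered when both speech features are available.
 * Assumption: the two flags would come from the app's own configuration.
 */
function registerAudioTranslationSketch(
	FakeRegistrationContext $context,
	bool $sttEnabled,
	bool $ttsEnabled,
): void {
	if (!$sttEnabled || !$ttsEnabled) {
		return;
	}
	$context->registerTaskProcessingTaskType('AudioToAudioTranslateTaskType');
	$context->registerTaskProcessingProvider('AudioToAudioTranslateProvider');
}
```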

julien-nc requested review from kyteinsky and marcelklehr February 4, 2026 16:11

julien-nc added enhancement New feature or request 3. to review labels Feb 4, 2026

julien-nc changed the title ~~Add audio translation task type and provider…~~ Add audio translation task type and provider Feb 4, 2026

julien-nc force-pushed the enh/noid/audio-translation branch 3 times, most recently from f3a6a09 to 0c8d7be Compare February 4, 2026 16:28

feat(audio-translation): add audio translation task type and provider…

2e45e17

…, factorize translation logic in a service use the correct user language for text translations happening in the task Signed-off-by: Julien Veyssier <julien-nc@posteo.net>

julien-nc force-pushed the enh/noid/audio-translation branch from 0c8d7be to 2e45e17 Compare February 4, 2026 16:40

kyteinsky reviewed Feb 4, 2026


kyteinsky mentioned this pull request Feb 6, 2026

Translation provider prompt should be engineered a bit more to avoid outputting explanations #336

Open
